Overview

Dataset statistics

Number of variables: 15
Number of observations: 236142
Missing cells: 52575
Missing cells (%): 1.5%
Duplicate rows: 1747
Duplicate rows (%): 0.7%
Total size in memory: 27.0 MiB
Average record size in memory: 120.0 B
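
The percentages and the per-record size can be recomputed from the raw counts. The sketch below is a minimal check of that arithmetic. It assumes that missing cells are counted over every cell (rows times columns) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Raw figures copied from the dataset statistics.
n_variables = 15
n_observations = 236142
missing_cells = 52575
duplicate_rows = 1747
total_size_mib = 27.0

# Assumption: the missing-cell percentage is taken over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate percentage is taken over all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.0f} B")
```

Under these assumptions the results round to the reported 1.5%, 0.7% and roughly 120 B.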

Variable types

Numeric: 8
Categorical: 7

Alerts

Dataset has 1747 (0.7%) duplicate rows [Duplicates]
order_date is highly overall correlated with coupon_usage [High correlation]
delivery_fee is highly overall correlated with delivery_time_in_seconds [High correlation]
delivery_time_in_seconds is highly overall correlated with delivery_fee [High correlation]
nb_menu_items is highly overall correlated with restaurant_id and 1 other field [High correlation]
coupon_usage is highly overall correlated with order_date [High correlation]
restaurant_category is highly overall correlated with restaurant_id and 3 other fields [High correlation]
restaurant_type is highly overall correlated with restaurant_category [High correlation]
province is highly overall correlated with restaurant_category [High correlation]
restaurant_id is highly overall correlated with restaurant_category and 1 other field [High correlation]
cooking_time_in_seconds has 25966 (11.0%) missing values [Missing]
delivery_time_in_seconds has 20804 (8.8%) missing values [Missing]
nb_menu_items has 5805 (2.5%) missing values [Missing]
cooking_time_in_seconds is highly skewed (γ1 = 92.33622733) [Skewed]
delivery_time_in_seconds is highly skewed (γ1 = 64.5824559) [Skewed]
delivery_fee has 91802 (38.9%) zeros [Zeros]
food_price has 6378 (2.7%) zeros [Zeros]
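
The Skewed alerts report the skewness coefficient γ1. The sketch below shows how a single extreme value can drive that coefficient. It assumes the plain moment definition γ1 = m3 / m2^1.5; the report generator may apply a bias-corrected estimator, so its values can differ slightly. The sample durations are hypothetical and not taken from the dataset.

```python
def moment_skewness(values):
    """Moment skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical durations in seconds (illustrative only).
typical = [400, 500, 600, 700, 800]
with_outlier = typical + [150000]

print(moment_skewness(typical))       # symmetric sample
print(moment_skewness(with_outlier))  # one extreme value pulls the skewness up
```

The symmetric sample has zero skewness, while adding one very large duration makes the coefficient clearly positive.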

Reproduction

Analysis started: 2022-12-03 10:02:27.899026
Analysis finished: 2022-12-03 10:02:43.212883
Duration: 15.31 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
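
The reported duration follows from the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details.
started = datetime.fromisoformat("2022-12-03 10:02:27.899026")
finished = datetime.fromisoformat("2022-12-03 10:02:43.212883")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```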

Variables

order_date
Real number (ℝ)

Distinct: 365
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 212.94699
Minimum: 1
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 29
Q1: 122
Median: 228
Q3: 310
95-th percentile: 356
Maximum: 365
Range: 364
Interquartile range (IQR): 188

Descriptive statistics

Standard deviation: 106.79004
Coefficient of variation (CV): 0.50148649
Kurtosis: -1.1787103
Mean: 212.94699
Median Absolute Deviation (MAD): 92
Skewness: -0.30514585
Sum: 50285729
Variance: 11404.113
Monotonicity: Increasing
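
Several of the order_date figures are derived from others. The sketch below recomputes the range, interquartile range, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative.

```python
# Values copied from the order_date statistics.
minimum, maximum = 1, 365
q1, q3 = 122, 310
mean = 212.94699
std = 106.79004

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 4), round(variance, 1))
```

The results agree with the reported 364, 188, 0.50148649 and 11404.113 up to the rounding of the inputs.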
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
358 1284
 
0.5%
363 1274
 
0.5%
360 1269
 
0.5%
359 1244
 
0.5%
361 1230
 
0.5%
335 1217
 
0.5%
362 1215
 
0.5%
356 1194
 
0.5%
342 1179
 
0.5%
320 1170
 
0.5%
Other values (355) 223866
94.8%
ValueCountFrequency (%)
1 398
0.2%
2 465
0.2%
3 415
0.2%
4 413
0.2%
5 396
0.2%
6 439
0.2%
7 391
0.2%
8 393
0.2%
9 423
0.2%
10 371
0.2%
ValueCountFrequency (%)
365 1121
0.5%
364 1088
0.5%
363 1274
0.5%
362 1215
0.5%
361 1230
0.5%
360 1269
0.5%
359 1244
0.5%
358 1284
0.5%
357 1145
0.5%
356 1194
0.5%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
Saturday 35554
Sunday 35310
Friday 34420
Monday 33743
Tuesday 32735
Other values (2) 64380

Length

Max length: 9
Median length: 8
Mean length: 7.1193816
Min length: 6

Characters and Unicode

Total characters: 1681185
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Monday
2nd row: Monday
3rd row: Monday
4th row: Monday
5th row: Monday

Common Values

ValueCountFrequency (%)
Saturday 35554
15.1%
Sunday 35310
15.0%
Friday 34420
14.6%
Monday 33743
14.3%
Tuesday 32735
13.9%
Thursday 32650
13.8%
Wednesday 31730
13.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
saturday 35554
15.1%
sunday 35310
15.0%
friday 34420
14.6%
monday 33743
14.3%
tuesday 32735
13.9%
thursday 32650
13.8%
wednesday 31730
13.4%

Most occurring characters

ValueCountFrequency (%)
a 271696
16.2%
d 267872
15.9%
y 236142
14.0%
u 136249
8.1%
r 102624
 
6.1%
n 100783
 
6.0%
s 97115
 
5.8%
e 96195
 
5.7%
S 70864
 
4.2%
T 65385
 
3.9%
Other values (7) 236260
14.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1445043
86.0%
Uppercase Letter 236142
 
14.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 271696
18.8%
d 267872
18.5%
y 236142
16.3%
u 136249
9.4%
r 102624
 
7.1%
n 100783
 
7.0%
s 97115
 
6.7%
e 96195
 
6.7%
t 35554
 
2.5%
i 34420
 
2.4%
Other values (2) 66393
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
S 70864
30.0%
T 65385
27.7%
F 34420
14.6%
M 33743
14.3%
W 31730
13.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1681185
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 271696
16.2%
d 267872
15.9%
y 236142
14.0%
u 136249
8.1%
r 102624
 
6.1%
n 100783
 
6.0%
s 97115
 
5.8%
e 96195
 
5.7%
S 70864
 
4.2%
T 65385
 
3.9%
Other values (7) 236260
14.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1681185
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 271696
16.2%
d 267872
15.9%
y 236142
14.0%
u 136249
8.1%
r 102624
 
6.1%
n 100783
 
6.0%
s 97115
 
5.8%
e 96195
 
5.7%
S 70864
 
4.2%
T 65385
 
3.9%
Other values (7) 236260
14.1%

order_hour
Real number (ℝ)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.544892
Minimum: 0
Maximum: 23
Zeros: 1345
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 9
Q1: 11
Median: 14
Q3: 18
95-th percentile: 21
Maximum: 23
Range: 23
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.0859192
Coefficient of variation (CV): 0.2809178
Kurtosis: -0.10332688
Mean: 14.544892
Median Absolute Deviation (MAD): 3
Skewness: -0.16435209
Sum: 3434660
Variance: 16.694736
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
11 27374
11.6%
12 24302
10.3%
18 21337
9.0%
13 18967
 
8.0%
19 18260
 
7.7%
17 17293
 
7.3%
14 16315
 
6.9%
10 16203
 
6.9%
15 14084
 
6.0%
16 14011
 
5.9%
Other values (14) 47996
20.3%
ValueCountFrequency (%)
0 1345
 
0.6%
1 364
 
0.2%
2 195
 
0.1%
3 159
 
0.1%
4 190
 
0.1%
5 60
 
< 0.1%
6 751
 
0.3%
7 2147
 
0.9%
8 5106
2.2%
9 8959
3.8%
ValueCountFrequency (%)
23 2902
 
1.2%
22 5047
 
2.1%
21 7983
 
3.4%
20 12788
5.4%
19 18260
7.7%
18 21337
9.0%
17 17293
7.3%
16 14011
5.9%
15 14084
6.0%
14 16315
6.9%

delivery_status
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
COMPLETED 215367
CANCELED_BY_USER 10464
CANCELED_BY_RESTAURANT 7573
EXPIRED_BY_DRIVER 1941
FAILED 246
Other values (13) 551

Length

Max length: 25
Median length: 9
Mean length: 9.8051935
Min length: 6

Characters and Unicode

Total characters: 2315418
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: COMPLETED
2nd row: COMPLETED
3rd row: COMPLETED
4th row: COMPLETED
5th row: COMPLETED

Common Values

ValueCountFrequency (%)
COMPLETED 215367
91.2%
CANCELED_BY_USER 10464
 
4.4%
CANCELED_BY_RESTAURANT 7573
 
3.2%
EXPIRED_BY_DRIVER 1941
 
0.8%
FAILED 246
 
0.1%
CANCELED_BY_DRIVER 233
 
0.1%
CANCELED_BY_CS 192
 
0.1%
DROP_OFF_DONE 88
 
< 0.1%
QUOTED 8
 
< 0.1%
PICK_UP_FAILED 8
 
< 0.1%
Other values (8) 22
 
< 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
completed 215367
91.2%
canceled_by_user 10464
 
4.4%
canceled_by_restaurant 7573
 
3.2%
expired_by_driver 1941
 
0.8%
failed 246
 
0.1%
canceled_by_driver 233
 
0.1%
canceled_by_cs 192
 
0.1%
drop_off_done 88
 
< 0.1%
pick_up_failed 8
 
< 0.1%
quoted 8
 
< 0.1%
Other values (8) 22
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 492166
21.3%
C 252506
10.9%
D 238420
10.3%
L 234083
10.1%
T 230561
10.0%
P 217416
9.4%
O 215654
9.3%
M 215368
9.3%
_ 41038
 
1.8%
A 33885
 
1.5%
Other values (14) 144321
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2274380
98.2%
Connector Punctuation 41038
 
1.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 492166
21.6%
C 252506
11.1%
D 238420
10.5%
L 234083
10.3%
T 230561
10.1%
P 217416
9.6%
O 215654
9.5%
M 215368
9.5%
A 33885
 
1.5%
R 32049
 
1.4%
Other values (13) 112272
 
4.9%
Connector Punctuation
ValueCountFrequency (%)
_ 41038
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2274380
98.2%
Common 41038
 
1.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 492166
21.6%
C 252506
11.1%
D 238420
10.5%
L 234083
10.3%
T 230561
10.1%
P 217416
9.6%
O 215654
9.5%
M 215368
9.5%
A 33885
 
1.5%
R 32049
 
1.4%
Other values (13) 112272
 
4.9%
Common
ValueCountFrequency (%)
_ 41038
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2315418
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 492166
21.3%
C 252506
10.9%
D 238420
10.3%
L 234083
10.1%
T 230561
10.0%
P 217416
9.4%
O 215654
9.3%
M 215368
9.3%
_ 41038
 
1.8%
A 33885
 
1.5%
Other values (14) 144321
 
6.2%

payment_method
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
CASH 191968
LINEMAN_CREDIT_CARD 23154
RLP 21020

Length

Max length: 19
Median length: 4
Mean length: 5.3817534
Min length: 3

Characters and Unicode

Total characters: 1270858
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CASH
2nd row: CASH
3rd row: CASH
4th row: CASH
5th row: CASH
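
The length and character totals for payment_method can be recomputed from the three category counts. The sketch below assumes every row holds exactly one of the three labels, which is consistent with the zero missing values reported for this variable. It also counts underscores, which the report lists as Connector Punctuation.

```python
# Category counts copied from the payment_method section.
counts = {"CASH": 191968, "LINEMAN_CREDIT_CARD": 23154, "RLP": 21020}

n_rows = sum(counts.values())
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_rows
underscores = sum(label.count("_") * n for label, n in counts.items())

print(n_rows, total_characters, round(mean_length, 7), underscores)
```

The totals match the reported 236142 rows, 1270858 characters, a mean length of 5.3817534 and 46308 underscore characters.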

Common Values

ValueCountFrequency (%)
CASH 191968
81.3%
LINEMAN_CREDIT_CARD 23154
 
9.8%
RLP 21020
 
8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cash 191968
81.3%
lineman_credit_card 23154
 
9.8%
rlp 21020
 
8.9%

Most occurring characters

ValueCountFrequency (%)
C 238276
18.7%
A 238276
18.7%
S 191968
15.1%
H 191968
15.1%
R 67328
 
5.3%
I 46308
 
3.6%
N 46308
 
3.6%
E 46308
 
3.6%
_ 46308
 
3.6%
D 46308
 
3.6%
Other values (4) 111502
8.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1224550
96.4%
Connector Punctuation 46308
 
3.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 238276
19.5%
A 238276
19.5%
S 191968
15.7%
H 191968
15.7%
R 67328
 
5.5%
I 46308
 
3.8%
N 46308
 
3.8%
E 46308
 
3.8%
D 46308
 
3.8%
L 44174
 
3.6%
Other values (3) 67328
 
5.5%
Connector Punctuation
ValueCountFrequency (%)
_ 46308
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1224550
96.4%
Common 46308
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 238276
19.5%
A 238276
19.5%
S 191968
15.7%
H 191968
15.7%
R 67328
 
5.5%
I 46308
 
3.8%
N 46308
 
3.8%
E 46308
 
3.8%
D 46308
 
3.8%
L 44174
 
3.6%
Other values (3) 67328
 
5.5%
Common
ValueCountFrequency (%)
_ 46308
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1270858
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 238276
18.7%
A 238276
18.7%
S 191968
15.1%
H 191968
15.1%
R 67328
 
5.3%
I 46308
 
3.6%
N 46308
 
3.6%
E 46308
 
3.6%
_ 46308
 
3.6%
D 46308
 
3.6%
Other values (4) 111502
8.8%

coupon_usage
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
COUPON USED 193529
NO COUPON 42613

Length

Max length: 11
Median length: 11
Mean length: 10.63909
Min length: 9

Characters and Unicode

Total characters: 2512336
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO COUPON
2nd row: COUPON USED
3rd row: NO COUPON
4th row: NO COUPON
5th row: NO COUPON

Common Values

ValueCountFrequency (%)
COUPON USED 193529
82.0%
NO COUPON 42613
 
18.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
coupon 236142
50.0%
used 193529
41.0%
no 42613
 
9.0%
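
The word table above can be reproduced from the two category counts. The sketch below assumes that each label is lowercased and split on whitespace, and that each word is weighted by the number of rows carrying that label. This is an assumption about how the table was built, not a description of the report generator's code.

```python
from collections import Counter

# Category counts copied from the coupon_usage section.
category_counts = {"COUPON USED": 193529, "NO COUPON": 42613}

word_counts = Counter()
for label, n in category_counts.items():
    # Assumption: words are lowercased and split on whitespace.
    for word in label.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(word, n, f"{100 * n / total_words:.1f}%")
```

Because "coupon" appears in both labels, it accounts for exactly half of all words.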

Most occurring characters

ValueCountFrequency (%)
O 514897
20.5%
U 429671
17.1%
N 278755
11.1%
C 236142
9.4%
P 236142
9.4%
(space) 236142
9.4%
S 193529
 
7.7%
E 193529
 
7.7%
D 193529
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2276194
90.6%
Space Separator 236142
 
9.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 514897
22.6%
U 429671
18.9%
N 278755
12.2%
C 236142
10.4%
P 236142
10.4%
S 193529
 
8.5%
E 193529
 
8.5%
D 193529
 
8.5%
Space Separator
ValueCountFrequency (%)
(space) 236142
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2276194
90.6%
Common 236142
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 514897
22.6%
U 429671
18.9%
N 278755
12.2%
C 236142
10.4%
P 236142
10.4%
S 193529
 
8.5%
E 193529
 
8.5%
D 193529
 
8.5%
Common
ValueCountFrequency (%)
(space) 236142
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2512336
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 514897
20.5%
U 429671
17.1%
N 278755
11.1%
C 236142
9.4%
P 236142
9.4%
(space) 236142
9.4%
S 193529
 
7.7%
E 193529
 
7.7%
D 193529
 
7.7%

delivery_fee
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 385
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.205973
Minimum: 0
Maximum: 1182
Zeros: 91802
Zeros (%): 38.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 10
Q3: 15
95-th percentile: 73
Maximum: 1182
Range: 1182
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 30.912053
Coefficient of variation (CV): 1.9074482
Kurtosis: 48.825425
Mean: 16.205973
Median Absolute Deviation (MAD): 10
Skewness: 4.8844479
Sum: 3826910.9
Variance: 955.55505
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 91802
38.9%
10 45572
19.3%
15 20556
 
8.7%
5 19845
 
8.4%
20 16067
 
6.8%
29 4262
 
1.8%
25 3927
 
1.7%
40 2885
 
1.2%
50 2800
 
1.2%
62 2538
 
1.1%
Other values (375) 25888
 
11.0%
ValueCountFrequency (%)
0 91802
38.9%
1 1010
 
0.4%
2 299
 
0.1%
3 22
 
< 0.1%
4 19
 
< 0.1%
5 19845
 
8.4%
6 323
 
0.1%
7 47
 
< 0.1%
8 119
 
0.1%
9 1234
 
0.5%
ValueCountFrequency (%)
1182 1
 
< 0.1%
697 5
< 0.1%
691 2
 
< 0.1%
689 1
 
< 0.1%
681 9
< 0.1%
596 1
 
< 0.1%
585 1
 
< 0.1%
575 1
 
< 0.1%
573 1
 
< 0.1%
571 2
 
< 0.1%

food_price
Real number (ℝ)

Distinct: 1814
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 196.77864
Minimum: 0
Maximum: 7001
Zeros: 6378
Zeros (%): 2.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 50
Q1: 80
Median: 138
Q3: 230
95-th percentile: 566
Maximum: 7001
Range: 7001
Interquartile range (IQR): 150

Descriptive statistics

Standard deviation: 206.84037
Coefficient of variation (CV): 1.0511322
Kurtosis: 26.920898
Mean: 196.77864
Median Absolute Deviation (MAD): 65
Skewness: 3.8469509
Sum: 46467701
Variance: 42782.937
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
60 8642
 
3.7%
120 7056
 
3.0%
100 6986
 
3.0%
0 6378
 
2.7%
50 6357
 
2.7%
55 5532
 
2.3%
65 5321
 
2.3%
70 5306
 
2.2%
110 5184
 
2.2%
150 4836
 
2.0%
Other values (1804) 174544
73.9%
ValueCountFrequency (%)
0 6378
2.7%
1 1
 
< 0.1%
9 1
 
< 0.1%
10 5
 
< 0.1%
13 1
 
< 0.1%
14 1
 
< 0.1%
15 4
 
< 0.1%
18.69 2
 
< 0.1%
20 16
 
< 0.1%
24 31
 
< 0.1%
ValueCountFrequency (%)
7001 1
< 0.1%
5196 1
< 0.1%
4577 1
< 0.1%
4165 1
< 0.1%
3560 1
< 0.1%
2997 1
< 0.1%
2978 1
< 0.1%
2960 1
< 0.1%
2885 2
< 0.1%
2837 1
< 0.1%

cooking_time_in_seconds
Real number (ℝ)

MISSING
SKEWED

Distinct: 3366
Distinct (%): 1.6%
Missing: 25966
Missing (%): 11.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 782.58559
Minimum: 0
Maximum: 156541
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 258
Q1: 468
Median: 681
Q3: 980
95-th percentile: 1640
Maximum: 156541
Range: 156541
Interquartile range (IQR): 512

Descriptive statistics

Standard deviation: 612.56911
Coefficient of variation (CV): 0.78275031
Kurtosis: 21350.077
Mean: 782.58559
Median Absolute Deviation (MAD): 244
Skewness: 92.336227
Sum: 1.6448071 × 10^8
Variance: 375240.91
Monotonicity: Not monotonic
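
The very large skewness and kurtosis of cooking_time_in_seconds suggest a handful of extreme values. One common way to gauge this is Tukey's 1.5 × IQR rule. The report itself does not apply this rule; it is used here only as an illustrative check on the reported quartiles.

```python
# Quartiles and maximum copied from the cooking_time_in_seconds statistics (seconds).
q1, q3 = 468, 980
maximum = 156541

iqr = q3 - q1
# Tukey fences: an illustrative outlier rule, not something the report computes.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(iqr, lower_fence, upper_fence)
print(f"maximum is {maximum / 3600:.1f} hours")
```

Under this rule any cooking time above 1748 seconds (about 29 minutes) would be flagged, and the reported maximum lies far beyond that fence.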
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
461 295
 
0.1%
541 289
 
0.1%
526 282
 
0.1%
556 281
 
0.1%
518 280
 
0.1%
546 279
 
0.1%
538 279
 
0.1%
613 278
 
0.1%
494 277
 
0.1%
577 276
 
0.1%
Other values (3356) 207360
87.8%
(Missing) 25966
 
11.0%
ValueCountFrequency (%)
0 3
< 0.1%
1 2
< 0.1%
2 3
< 0.1%
3 3
< 0.1%
4 1
 
< 0.1%
5 3
< 0.1%
6 4
< 0.1%
7 4
< 0.1%
8 3
< 0.1%
9 4
< 0.1%
ValueCountFrequency (%)
156541 1
< 0.1%
76956 1
< 0.1%
53582 1
< 0.1%
36755 1
< 0.1%
15809 1
< 0.1%
14321 1
< 0.1%
12204 1
< 0.1%
8979 1
< 0.1%
8168 1
< 0.1%
7199 1
< 0.1%

delivery_time_in_seconds
Real number (ℝ)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 2964
Distinct (%): 1.4%
Missing: 20804
Missing (%): 8.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 541.57356
Minimum: 0
Maximum: 101097
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 137
Q1: 317
Median: 468
Q3: 671
95-th percentile: 1161
Maximum: 101097
Range: 101097
Interquartile range (IQR): 354

Descriptive statistics

Standard deviation: 496.37308
Coefficient of variation (CV): 0.9165386
Kurtosis: 10277.512
Mean: 541.57356
Median Absolute Deviation (MAD): 171
Skewness: 64.582456
Sum: 1.1662137 × 10^8
Variance: 246386.23
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3 667
 
0.3%
4 483
 
0.2%
2 459
 
0.2%
403 416
 
0.2%
375 407
 
0.2%
353 402
 
0.2%
347 399
 
0.2%
385 397
 
0.2%
335 394
 
0.2%
369 392
 
0.2%
Other values (2954) 210922
89.3%
(Missing) 20804
 
8.8%
ValueCountFrequency (%)
0 1
 
< 0.1%
1 26
 
< 0.1%
2 459
0.2%
3 667
0.3%
4 483
0.2%
5 272
0.1%
6 201
 
0.1%
7 142
 
0.1%
8 135
 
0.1%
9 126
 
0.1%
ValueCountFrequency (%)
101097 1
< 0.1%
55967 1
< 0.1%
53985 1
< 0.1%
50289 1
< 0.1%
45512 1
< 0.1%
44363 1
< 0.1%
23329 1
< 0.1%
19957 1
< 0.1%
12787 1
< 0.1%
12584 1
< 0.1%

restaurant_id
Real number (ℝ)

Distinct: 200
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.985221
Minimum: 1
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 12
Q1: 55
Median: 104
Q3: 142
95-th percentile: 179
Maximum: 200
Range: 199
Interquartile range (IQR): 87

Descriptive statistics

Standard deviation: 52.467492
Coefficient of variation (CV): 0.5300538
Kurtosis: -1.0562857
Mean: 98.985221
Median Absolute Deviation (MAD): 41
Skewness: -0.12205706
Sum: 23374568
Variance: 2752.8378
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
158 4883
 
2.1%
129 4879
 
2.1%
75 4566
 
1.9%
39 4547
 
1.9%
123 4388
 
1.9%
152 4338
 
1.8%
174 4164
 
1.8%
122 4021
 
1.7%
55 3999
 
1.7%
86 3966
 
1.7%
Other values (190) 192391
81.5%
ValueCountFrequency (%)
1 617
 
0.3%
2 320
 
0.1%
3 1662
0.7%
4 1646
0.7%
5 2370
1.0%
6 854
 
0.4%
7 993
0.4%
8 884
 
0.4%
9 299
 
0.1%
10 372
 
0.2%
ValueCountFrequency (%)
200 219
 
0.1%
199 897
0.4%
198 321
 
0.1%
197 220
 
0.1%
196 187
 
0.1%
195 216
 
0.1%
194 354
 
0.1%
193 480
0.2%
192 194
 
0.1%
191 839
0.4%

restaurant_category
Categorical

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
À La Carte 26977
Rice Dish 24427
North East 22397
Café/Coffee Shop 20377
Noodles 20143
Other values (28) 121821

Length

Max length: 28
Median length: 20
Mean length: 10.579338
Min length: 4

Characters and Unicode

Total characters: 2498226
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Rice Dish
2nd row: Rice Dish
3rd row: Delivery Only
4th row: Street Food/Food Stands
5th row: Rice Dish

Common Values

ValueCountFrequency (%)
À La Carte 26977
11.4%
Rice Dish 24427
 
10.3%
North East 22397
 
9.5%
Café/Coffee Shop 20377
 
8.6%
Noodles 20143
 
8.5%
Thai 18977
 
8.0%
Bakery/Cake 11532
 
4.9%
Dessert 10741
 
4.5%
Bubble Milk Tea 8713
 
3.7%
Delivery Only 8409
 
3.6%
Other values (23) 63449
26.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
rice 30173
 
7.1%
à 26977
 
6.3%
la 26977
 
6.3%
carte 26977
 
6.3%
dish 24427
 
5.7%
north 22397
 
5.3%
east 22397
 
5.3%
café/coffee 20377
 
4.8%
shop 20377
 
4.8%
noodles 20143
 
4.7%
Other values (40) 185085
43.4%

Most occurring characters

ValueCountFrequency (%)
e 279377
 
11.2%
a 205745
 
8.2%
(space) 190165
 
7.6%
o 178387
 
7.1%
i 138156
 
5.5%
t 124111
 
5.0%
r 118582
 
4.7%
s 112057
 
4.5%
h 109815
 
4.4%
C 87113
 
3.5%
Other values (36) 954718
38.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1772415
70.9%
Uppercase Letter 479264
 
19.2%
Space Separator 190165
 
7.6%
Other Punctuation 56382
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 279377
15.8%
a 205745
11.6%
o 178387
10.1%
i 138156
 
7.8%
t 124111
 
7.0%
r 118582
 
6.7%
s 112057
 
6.3%
h 109815
 
6.2%
f 82106
 
4.6%
l 60219
 
3.4%
Other values (14) 363860
20.5%
Uppercase Letter
ValueCountFrequency (%)
C 87113
18.2%
S 53705
11.2%
N 52307
10.9%
D 43577
9.1%
L 32723
 
6.8%
T 31115
 
6.5%
R 30173
 
6.3%
B 29950
 
6.2%
À 26977
 
5.6%
E 22397
 
4.7%
Other values (10) 69227
14.4%
Space Separator
ValueCountFrequency (%)
(space) 190165
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 56382
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2251679
90.1%
Common 246547
 
9.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 279377
 
12.4%
a 205745
 
9.1%
o 178387
 
7.9%
i 138156
 
6.1%
t 124111
 
5.5%
r 118582
 
5.3%
s 112057
 
5.0%
h 109815
 
4.9%
C 87113
 
3.9%
f 82106
 
3.6%
Other values (34) 816230
36.2%
Common
ValueCountFrequency (%)
(space) 190165
77.1%
/ 56382
 
22.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2450872
98.1%
None 47354
 
1.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 279377
 
11.4%
a 205745
 
8.4%
(space) 190165
 
7.8%
o 178387
 
7.3%
i 138156
 
5.6%
t 124111
 
5.1%
r 118582
 
4.8%
s 112057
 
4.6%
h 109815
 
4.5%
C 87113
 
3.6%
Other values (34) 907364
37.0%
None
ValueCountFrequency (%)
À 26977
57.0%
é 20377
43.0%

restaurant_type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
NON-CHAIN 194085
CHAIN_RESTAURANT 42057

Length

Max length: 16
Median length: 9
Mean length: 10.246703
Min length: 9

Characters and Unicode

Total characters: 2419677
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NON-CHAIN
2nd row: CHAIN_RESTAURANT
3rd row: NON-CHAIN
4th row: NON-CHAIN
5th row: NON-CHAIN

Common Values

ValueCountFrequency (%)
NON-CHAIN 194085
82.2%
CHAIN_RESTAURANT 42057
 
17.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
non-chain 194085
82.2%
chain_restaurant 42057
 
17.8%

Most occurring characters

ValueCountFrequency (%)
N 666369
27.5%
A 320256
13.2%
C 236142
 
9.8%
H 236142
 
9.8%
I 236142
 
9.8%
O 194085
 
8.0%
- 194085
 
8.0%
R 84114
 
3.5%
T 84114
 
3.5%
_ 42057
 
1.7%
Other values (3) 126171
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2183535
90.2%
Dash Punctuation 194085
 
8.0%
Connector Punctuation 42057
 
1.7%
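
The category totals above can be reproduced by classifying each character of the two labels and weighting by the row counts. The sketch below uses Python's standard unicodedata module, which is an assumption about tooling and not necessarily what the report generator uses. In its short codes, Lu is Uppercase Letter, Pd is Dash Punctuation and Pc is Connector Punctuation.

```python
import unicodedata
from collections import Counter

# Category counts copied from the restaurant_type section.
counts = {"NON-CHAIN": 194085, "CHAIN_RESTAURANT": 42057}

category_totals = Counter()
for label, n in counts.items():
    for ch in label:
        # unicodedata.category returns the two-letter general category code.
        category_totals[unicodedata.category(ch)] += n

print(category_totals)
```

The hyphen in NON-CHAIN and the underscore in CHAIN_RESTAURANT fall into different punctuation categories, which is why the variable has three distinct categories.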

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 666369
30.5%
A 320256
14.7%
C 236142
 
10.8%
H 236142
 
10.8%
I 236142
 
10.8%
O 194085
 
8.9%
R 84114
 
3.9%
T 84114
 
3.9%
E 42057
 
1.9%
S 42057
 
1.9%
Dash Punctuation
ValueCountFrequency (%)
- 194085
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 42057
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2183535
90.2%
Common 236142
 
9.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 666369
30.5%
A 320256
14.7%
C 236142
 
10.8%
H 236142
 
10.8%
I 236142
 
10.8%
O 194085
 
8.9%
R 84114
 
3.9%
T 84114
 
3.9%
E 42057
 
1.9%
S 42057
 
1.9%
Common
ValueCountFrequency (%)
- 194085
82.2%
_ 42057
 
17.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2419677
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 666369
27.5%
A 320256
13.2%
C 236142
 
9.8%
H 236142
 
9.8%
I 236142
 
9.8%
O 194085
 
8.0%
- 194085
 
8.0%
R 84114
 
3.5%
T 84114
 
3.5%
_ 42057
 
1.7%
Other values (3) 126171
 
5.2%

province
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
Bangkok 126936
Pathum Thani 30071
Samut Prakan 28624
Nonthaburi 28482
Samut Sakhon 14972

Length

Max length: 13
Median length: 7
Mean length: 9.100952
Min length: 7

Characters and Unicode

Total characters: 2149117
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nakhon Pathom
2nd row: Bangkok
3rd row: Bangkok
4th row: Samut Sakhon
5th row: Nakhon Pathom

Common Values

ValueCountFrequency (%)
Bangkok 126936
53.8%
Pathum Thani 30071
 
12.7%
Samut Prakan 28624
 
12.1%
Nonthaburi 28482
 
12.1%
Samut Sakhon 14972
 
6.3%
Nakhon Pathom 7057
 
3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
bangkok 126936
40.1%
samut 43596
 
13.8%
pathum 30071
 
9.5%
thani 30071
 
9.5%
prakan 28624
 
9.0%
nonthaburi 28482
 
9.0%
sakhon 14972
 
4.7%
nakhon 7057
 
2.2%
pathom 7057
 
2.2%

Most occurring characters

ValueCountFrequency (%)
a 345490
16.1%
k 304525
14.2%
n 236142
11.0%
o 184504
8.6%
B 126936
 
5.9%
g 126936
 
5.9%
h 117710
 
5.5%
t 109206
 
5.1%
u 102149
 
4.8%
m 80724
 
3.8%
Other values (8) 414795
19.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1751527
81.5%
Uppercase Letter 316866
 
14.7%
Space Separator 80724
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 345490
19.7%
k 304525
17.4%
n 236142
13.5%
o 184504
10.5%
g 126936
 
7.2%
h 117710
 
6.7%
t 109206
 
6.2%
u 102149
 
5.8%
m 80724
 
4.6%
i 58553
 
3.3%
Other values (2) 85588
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
B 126936
40.1%
P 65752
20.8%
S 58568
18.5%
N 35539
 
11.2%
T 30071
 
9.5%
Space Separator
ValueCountFrequency (%)
(space) 80724
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2068393
96.2%
Common 80724
 
3.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 345490
16.7%
k 304525
14.7%
n 236142
11.4%
o 184504
8.9%
B 126936
 
6.1%
g 126936
 
6.1%
h 117710
 
5.7%
t 109206
 
5.3%
u 102149
 
4.9%
m 80724
 
3.9%
Other values (7) 334071
16.2%
Common
ValueCountFrequency (%)
(space) 80724
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2149117
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 345490
16.1%
k 304525
14.2%
n 236142
11.0%
o 184504
8.6%
B 126936
 
5.9%
g 126936
 
5.9%
h 117710
 
5.5%
t 109206
 
5.1%
u 102149
 
4.8%
m 80724
 
3.8%
Other values (8) 414795
19.3%

nb_menu_items
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 111
Distinct (%): < 0.1%
Missing: 5805
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.563557
Minimum: 1
Maximum: 969
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 34
Median: 61
Q3: 106
95-th percentile: 301
Maximum: 969
Range: 968
Interquartile range (IQR): 72

Descriptive statistics

Standard deviation: 82.752999
Coefficient of variation (CV): 0.9671524
Kurtosis: 18.17982
Mean: 85.563557
Median Absolute Deviation (MAD): 36
Skewness: 2.8469687
Sum: 19708453
Variance: 6848.0588
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
61 9629
 
4.1%
41 7434
 
3.1%
118 7413
 
3.1%
65 6984
 
3.0%
21 6755
 
2.9%
58 6281
 
2.7%
46 5636
 
2.4%
63 5092
 
2.2%
38 4868
 
2.1%
25 4666
 
2.0%
Other values (101) 165579
70.1%
(Missing) 5805
 
2.5%
ValueCountFrequency (%)
1 1524
 
0.6%
2 210
 
0.1%
3 567
 
0.2%
4 1216
 
0.5%
5 4391
1.9%
6 1313
 
0.6%
7 1534
 
0.6%
8 1858
0.8%
9 564
 
0.2%
10 1228
 
0.5%
ValueCountFrequency (%)
969 313
 
0.1%
316 4566
1.9%
310 1989
0.8%
308 4164
1.8%
301 1180
 
0.5%
281 365
 
0.2%
258 2391
1.0%
232 2018
0.9%
219 652
 
0.3%
208 1646
 
0.7%

Interactions

[Figure: 64 interaction plot panels, one per ordered pair of the 8 numeric variables]

Correlations

Auto

The auto setting picks an interpretable pairwise metric for each pair of columns, according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramér's V, [0,1]
  • Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used to discretize the numerical column in a Numerical-Categorical pair can be changed using config.correlations["auto"].n_bins. The number of bins determines the granularity of the association being measured.

This configuration uses the recommended metric for each pair of columns.
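
As a rough illustration of what discretizing a numerical column means, the sketch below assigns values to n_bins equal-width bins so that the result can be treated as a categorical column. Equal-width binning is an assumption made for this example; the binning strategy the report actually uses is not stated here, and the function name equal_width_bins is hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width bins.

    Assumption: equal-width bins between the observed minimum and maximum.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column falls entirely into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum lands exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Toy delivery_fee-like values, chosen only for illustration.
fees = [0.0, 10.0, 20.0, 75.0, 94.0, 137.0]
print(equal_width_bins(fees, 4))
print(equal_width_bins(fees, 2))
```

With fewer bins, more distinct values collapse into the same category, so the association is measured at a coarser granularity.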

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
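
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is a minimal illustration, not the report's own implementation; giving tied values their average rank is an assumption of this sketch, and the helper names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average 1-based position of the tied run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations (the 1/n factors cancel)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]    # monotonic but nonlinear in x
print(spearman_rho(x, y))  # essentially 1: the ranks agree perfectly
print(pearson(x, y))       # below 1: the relationship is not linear
```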

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
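
A minimal sketch of this formula follows, together with a check of the invariance property. The five value pairs are the delivery_fee and delivery_time_in_seconds entries of the completed orders at indices 236132 and 236134 to 236137 in the Sample section; they are used only as illustrative input and say nothing about the correlation over the full dataset. The function name pearson_r is hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sd_x = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sd_y = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sd_x * sd_y)


fee = [94.0, 0.0, 0.0, 0.0, 137.0]
seconds = [1412.0, 335.0, 276.0, 248.0, 1403.0]

r = pearson_r(fee, seconds)

# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([2 * a + 5 for a in fee], [b / 60 - 3 for b in seconds])

# Multiplying one variable by a negative factor only flips the sign.
r_flipped = pearson_r([-a for a in fee], seconds)

print(r, r_rescaled, r_flipped)
```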

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
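
The pair-counting definition above can be sketched directly. This version is the simple form often called tau-a, which matches the description here; statistical libraries commonly compute a tie-adjusted variant instead, so results on data with many ties may differ from the heatmap. The function name kendall_tau_a is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    concordant = discordant = pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 is a tie: it counts toward the total only
    return (concordant - discordant) / pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
print(kendall_tau_a(x, y))  # (5 - 1) / 6
```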

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
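
The following is a sketch of the bias-corrected estimator as it is commonly written: the chi-squared statistic is turned into φ², a correction term is subtracted from φ², and the row and column counts are shrunk slightly. It is an illustration under that assumption rather than the report's exact code; the function name and the toy category pairings are hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for r_val, r_count in row_totals.items():
        for c_val, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r_val, c_val), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    # A variable with a single category gives denom == 0; report no association.
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data: restaurant_type fully determines the (made-up) second label.
kind = ["CHAIN_RESTAURANT"] * 50 + ["NON-CHAIN"] * 50
label = ["X"] * 50 + ["Y"] * 50
print(cramers_v_corrected(kind, label))   # essentially 1

# Toy data: the two labels are independent of each other.
left = ["A", "A", "B", "B"] * 25
right = ["X", "Y", "X", "Y"] * 25
print(cramers_v_corrected(left, right))
```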

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
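
One way to read the nullity correlation is as the ordinary Pearson correlation between the "is present" indicators of two columns. The sketch below follows that reading; it is an assumption about how the heatmap values are derived, and the column values are toy data rather than figures taken from the report.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators (1 = present) of two columns."""
    a = [0 if is_missing(v) else 1 for v in col_a]
    b = [0 if is_missing(v) else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always (or never) present has no defined correlation.
        return None
    return cov / math.sqrt(var_a * var_b)


nan = float("nan")
cooking = [424.0, nan, 345.0, nan, 659.0, 724.0]
delivery = [691.0, 285.0, 602.0, nan, 717.0, nan]

print(nullity_correlation(cooking, delivery))  # 0.25 for this toy data
```

A value near +1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present exactly when the other is missing, and a value near 0 means the missingness patterns are unrelated.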

Sample

First rows
(index) | order_date | order_day_of_week | order_hour | delivery_status | payment_method | coupon_usage | delivery_fee | food_price | cooking_time_in_seconds | delivery_time_in_seconds | restaurant_id | restaurant_category | restaurant_type | province | nb_menu_items
0 | 1 | Monday | 12 | COMPLETED | CASH | NO COUPON | 10.0 | 100.0 | 424.0 | 691.0 | 63 | Rice Dish | NON-CHAIN | Nakhon Pathom | 14.0
1 | 1 | Monday | 11 | COMPLETED | CASH | COUPON USED | 75.0 | 0.0 | NaN | 285.0 | 70 | Rice Dish | CHAIN_RESTAURANT | Bangkok | 66.0
2 | 1 | Monday | 9 | COMPLETED | CASH | NO COUPON | 0.0 | 130.0 | 345.0 | 602.0 | 106 | Delivery Only | NON-CHAIN | Bangkok | 15.0
3 | 1 | Monday | 18 | COMPLETED | CASH | NO COUPON | 20.0 | 80.0 | 710.0 | 367.0 | 27 | Street Food/Food Stands | NON-CHAIN | Samut Sakhon | 14.0
4 | 1 | Monday | 8 | COMPLETED | CASH | NO COUPON | 10.0 | 60.0 | 659.0 | 717.0 | 63 | Rice Dish | NON-CHAIN | Nakhon Pathom | 14.0
5 | 1 | Monday | 18 | COMPLETED | CASH | NO COUPON | 20.0 | 414.0 | 724.0 | 1222.0 | 33 | Chinese | NON-CHAIN | Bangkok | 23.0
6 | 1 | Monday | 9 | COMPLETED | CASH | NO COUPON | 0.0 | 60.0 | 950.0 | 3.0 | 27 | Street Food/Food Stands | NON-CHAIN | Samut Sakhon | 14.0
7 | 1 | Monday | 11 | COMPLETED | CASH | NO COUPON | 0.0 | 160.0 | 560.0 | 984.0 | 63 | Rice Dish | NON-CHAIN | Nakhon Pathom | 14.0
8 | 1 | Monday | 11 | COMPLETED | CASH | NO COUPON | 0.0 | 225.0 | 491.0 | 976.0 | 27 | Street Food/Food Stands | NON-CHAIN | Samut Sakhon | 14.0
9 | 1 | Monday | 14 | COMPLETED | RLP | NO COUPON | 10.0 | 100.0 | 1117.0 | 412.0 | 123 | Rice Dish | NON-CHAIN | Pathum Thani | 21.0

Last rows
(index) | order_date | order_day_of_week | order_hour | delivery_status | payment_method | coupon_usage | delivery_fee | food_price | cooking_time_in_seconds | delivery_time_in_seconds | restaurant_id | restaurant_category | restaurant_type | province | nb_menu_items
236132 | 365 | Monday | 17 | COMPLETED | CASH | COUPON USED | 94.0 | 860.0 | 1163.0 | 1412.0 | 26 | Thai | NON-CHAIN | Bangkok | 179.0
236133 | 365 | Monday | 11 | CANCELED_BY_USER | CASH | COUPON USED | 81.0 | 1100.0 | NaN | NaN | 26 | Thai | NON-CHAIN | Bangkok | 179.0
236134 | 365 | Monday | 13 | COMPLETED | CASH | COUPON USED | 0.0 | 50.0 | 346.0 | 335.0 | 85 | Street Food/Food Stands | NON-CHAIN | Bangkok | 8.0
236135 | 365 | Monday | 22 | COMPLETED | CASH | COUPON USED | 0.0 | 55.0 | 523.0 | 276.0 | 183 | Thai | NON-CHAIN | Pathum Thani | 23.0
236136 | 365 | Monday | 14 | COMPLETED | CASH | COUPON USED | 0.0 | 80.0 | 414.0 | 248.0 | 98 | Bubble Milk Tea | NON-CHAIN | Bangkok | 25.0
236137 | 365 | Monday | 15 | COMPLETED | CASH | COUPON USED | 137.0 | 960.0 | 1464.0 | 1403.0 | 26 | Thai | NON-CHAIN | Bangkok | 179.0
236138 | 365 | Monday | 14 | COMPLETED | CASH | COUPON USED | 0.0 | 50.0 | 207.0 | 401.0 | 85 | Street Food/Food Stands | NON-CHAIN | Bangkok | 8.0
236139 | 365 | Monday | 17 | COMPLETED | CASH | COUPON USED | 25.0 | 326.0 | 926.0 | 658.0 | 40 | Steak House/Barbeque | NON-CHAIN | Nonthaburi | 119.0
236140 | 365 | Monday | 17 | COMPLETED | CASH | COUPON USED | 0.0 | 240.0 | 709.0 | 497.0 | 185 | North East | NON-CHAIN | Bangkok | 41.0
236141 | 365 | Monday | 11 | COMPLETED | CASH | COUPON USED | 0.0 | 50.0 | 287.0 | 620.0 | 85 | Street Food/Food Stands | NON-CHAIN | Bangkok | 8.0

Duplicate rows

Most frequently occurring

(index) | order_date | order_day_of_week | order_hour | delivery_status | payment_method | coupon_usage | delivery_fee | food_price | cooking_time_in_seconds | delivery_time_in_seconds | restaurant_id | restaurant_category | restaurant_type | province | nb_menu_items | # duplicates
397 | 115 | Wednesday | 16 | EXPIRED_BY_DRIVER | CASH | COUPON USED | 0.0 | 99.0 | NaN | NaN | 12 | Bakery/Cake | CHAIN_RESTAURANT | Bangkok | 21.0 | 20
914 | 197 | Monday | 20 | CANCELED_BY_USER | CASH | COUPON USED | 107.0 | 0.0 | NaN | NaN | 17 | Moo Kata | NON-CHAIN | Bangkok | 25.0 | 17
390 | 115 | Wednesday | 16 | CANCELED_BY_USER | CASH | COUPON USED | 0.0 | 99.0 | NaN | NaN | 12 | Bakery/Cake | CHAIN_RESTAURANT | Bangkok | 21.0 | 10
391 | 115 | Wednesday | 16 | CANCELED_BY_USER | CASH | COUPON USED | 10.0 | 99.0 | NaN | NaN | 12 | Bakery/Cake | CHAIN_RESTAURANT | Bangkok | 21.0 | 9
448 | 119 | Sunday | 23 | CANCELED_BY_USER | CASH | COUPON USED | 10.0 | 70.0 | NaN | NaN | 79 | Rice Dish | NON-CHAIN | Samut Prakan | 160.0 | 9
399 | 115 | Wednesday | 16 | EXPIRED_BY_DRIVER | CASH | COUPON USED | 10.0 | 99.0 | NaN | NaN | 12 | Bakery/Cake | CHAIN_RESTAURANT | Bangkok | 21.0 | 8
977 | 215 | Friday | 10 | CANCELED_BY_USER | CASH | COUPON USED | 35.0 | 145.0 | NaN | NaN | 91 | Café/Coffee Shop | CHAIN_RESTAURANT | Bangkok | 81.0 | 8
501 | 124 | Friday | 19 | CANCELED_BY_USER | CASH | COUPON USED | 10.0 | 55.0 | NaN | NaN | 70 | Rice Dish | CHAIN_RESTAURANT | Bangkok | 66.0 | 7
1202 | 295 | Monday | 11 | CANCELED_BY_USER | CASH | COUPON USED | 29.0 | 350.0 | NaN | NaN | 149 | Bakery/Cake | NON-CHAIN | Bangkok | 42.0 | 7
245 | 75 | Friday | 19 | CANCELED_BY_USER | CASH | NO COUPON | 50.0 | 110.0 | NaN | NaN | 122 | Northern Food | NON-CHAIN | Bangkok | 38.0 | 6
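
A listing like the one above can be approximated by counting identical rows and keeping those that occur more than once. The sketch below is a simplified illustration, not the report's implementation: rows are shortened to a few columns, the function name is hypothetical, and missing values are written as None because float NaN values do not compare equal to each other and would prevent identical rows from being grouped.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) pairs for rows occurring more than once, most frequent first."""
    counts = Counter(rows)
    duplicated = [(row, count) for row, count in counts.items() if count > 1]
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:top]


# Shortened toy rows: (order_date, day, hour, status, delivery_fee, food_price, cooking_time).
expired = (115, "Wednesday", 16, "EXPIRED_BY_DRIVER", 0.0, 99.0, None)
canceled = (197, "Monday", 20, "CANCELED_BY_USER", 107.0, 0.0, None)
completed = (1, "Monday", 12, "COMPLETED", 10.0, 100.0, 424.0)

rows = [expired, canceled, expired, completed, canceled, expired]

for row, count in most_frequent_duplicates(rows):
    print(count, row)
```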